Process for automatic dynamic reloading of data flow processors (dfps) and units with two- or three-dimensional programmable cell architectures (fpgas, dpgas, and the like)

ABSTRACT

In a data-processing method, first result data may be obtained using a plurality of configurable coarse-granular elements, the first result data may be written into a memory that includes spatially separate first and second memory areas and that is connected via a bus to the plurality of configurable coarse-granular elements, the first result data may be subsequently read out from the memory, and the first result data may be subsequently processed using the plurality of configurable coarse-granular elements. In a first configuration, the first memory area may be configured as a write memory, and the second memory area may be configured as a read memory. Subsequent to writing to and reading from the memory in accordance with the first configuration, the first memory area may be configured as a read memory, and the second memory area may be configured as a write memory.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 10/265,846, filed Oct. 7, 2002, which is a continuation of U.S. patent application Ser. No. 09/613,217, filed Jul. 10, 2000, now U.S. Pat. No. 6,477,643, which is a continuation of U.S. patent application Ser. No. 08/947,002 filed on Oct. 8, 1997, now U.S. Pat. No. 6,088,795, expressly incorporated herein by reference in the entirety.

FIELD OF THE INVENTION

The present invention is directed to a process for automatic dynamic reloading of data flow processors.

BACKGROUND INFORMATION

Programmable units presently used (DFPs; FPGAs, i.e., Field Programmable Gate Arrays) can be programmed in two different ways:

one-time only, i.e., the configuration can no longer be changed after programming. All configured elements of the unit perform the same function over the entire period during which the application takes place.

on site, i.e., the configuration can be changed after the unit has been installed by loading a configuration file when the application is started. Most units (in particular FPGA units) cannot be reconfigured during operation. For reconfigurable units, data usually cannot be further processed while the unit is being reconfigured, and the time required is very long.

Configuration data is loaded into programmable units through a hardware interface. This process is slow and usually requires hundreds of milliseconds due to the limited bandwidth for accessing the external memory where the configuration data is stored, after which the programmable unit is available for the desired/programmed function as described in the configuration file.

A configuration is obtained by entering a special bit pattern of any desired length into the configurable elements of the unit. Configurable elements can be any type of RAM cells, multiplexers, interconnecting elements or ALUs. A configuration string is stored in such an element, so that the element preserves its configuration determined by the configuration string during the period of operation.

The existing methods and options present a series of problems, such as:

If a configuration in a DFP (see German Patent Application No. DE 44 16 881 A1) or an FPGA is to be modified, a complete configuration file must always be transmitted to the unit to be programmed, even if only a very small part of the configuration is to be modified.

As a new configuration is being loaded, the unit can only continue to process data to a limited extent or not at all.

With the increasing number of configurable elements in each unit (in particular in FPGA units), the configuration files of these units also become increasingly large (several hundred Kbytes on average). Configuring a large unit therefore takes a very long time, which often makes configuration during operation impossible or affects the function of the unit.

When a unit is partially configured during operation, a central logic entity is always used, through which all reconfigurations are managed. This requires considerable communication and synchronization resources.

SUMMARY OF THE INVENTION

The present invention makes it possible to reconfigure a programmable unit considerably more rapidly. The present invention allows different configurations of a programmable unit to be used in a flexible manner during operation without affecting or stopping the operability of the programmable unit. Unit configuration changes are performed simultaneously, so they are rapidly available without need for additional configuration data to be occasionally transmitted. The method can be used with all types of configurable elements of a configurable unit and with all types of configuration data, regardless of the purpose for which they are provided within the unit.

The present invention makes it possible to overcome the static limitations of conventional units and to improve the utilization of existing configurable elements. By introducing a buffer storage device, a plurality of different functions can be performed on the same data.

In a programmable unit, there is a plurality of ring memories, i.e., memories with a dedicated address control, which, upon reaching the end of the memory, continues at the starting point, thus forming a ring. These ring memories have read-write access to configuration registers, i.e., the circuits that receive the configuration data, of the elements to be configured. Such a ring memory has a certain number of records, which are loaded with configuration data by a PLU as described in German Patent Application No. DE 44 16 881 A1. The architecture of the records is selected so that their data format corresponds to the configurable element(s) connected to the ring memory and allows a valid configuration to be set.

Furthermore, there is a read position pointer, which selects one of the ring memory records as the current read record. The read position pointer can be moved to any desired position/record within the ring memory using a controller. Furthermore there is a write position pointer, which selects one of the ring memory records as the current write record. The write position pointer can be moved to any desired position/record within the ring memory using a controller.

At run time, to perform reconfiguration, a configuration string can be transmitted into the element to be configured without the data requiring management by a central logic or transmission. By using a plurality of ring memories, several configurable elements can be configured simultaneously.

Since a ring memory with its complete controller can switch configurable cells between several configuration modes, it is referred to as a switching table.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a schematic architecture of a ring memory.

FIG. 2 illustrates the internal architecture of a ring memory.

FIG. 3 illustrates a ring memory with a selectable work area.

FIG. 4 illustrates a ring memory and a controller capable of working on different ring memory sections using several read and write position pointers.

FIG. 5 illustrates a ring memory where different controllers access different sections.

FIG. 6 illustrates a ring memory and its connection to the configurable elements.

FIG. 7 illustrates the controller with a logic for responding to different trigger signals; a) implementation of the trigger pulse mask.

FIG. 8 illustrates the clock generator for the controller.

FIG. 9 illustrates the wiring of the controller and the internal cells allowing the configurable elements to be configured.

FIG. 10 illustrates the processing by the controller of the commands stored in the ring memory.

FIG. 11 illustrates the processing of the data stored in the ring memory.

FIG. 12 illustrates the connection of a buffer comprising two memory arrays to a set of configurable elements.

FIG. 12 a shows a step in the data processing sequence.

FIG. 12 b shows another step in the data processing sequence.

FIG. 12 c shows another step in the data processing sequence.

FIG. 12 d shows another step in the data processing sequence.

FIG. 13 illustrates the connection of a buffer with separate read/write pointers to a set of configurable elements.

FIG. 14 illustrates the operation of a buffer with separate read/write pointers.

FIG. 15 illustrates the connection of two buffers each comprising two memory arrays to a set of configurable elements; Figures a-c show the data processing sequence.

DETAILED DESCRIPTION OF THE INVENTION

There is a plurality of ring memories in a programmable unit or connected externally to said unit. The one or more ring memories have one or more controllers controlling the one or more ring memories. These controllers are part of the PLU named in German Patent Application No. DE 44 16 881 A1. The ring memories contain configuration strings for the configurable elements of one or a plurality of configurable units; the configurable elements can also be expressly used for interconnecting function groups and they can be crossbar circuits or multiplexers for interconnecting bus architectures, which are conventional.

Ring memories and ring memory controllers can be either directly hardware-implemented or first obtained by configuring one or more configurable cells of a configurable unit (e.g., FPGA).

Conventional ring memories can be used as ring memories, in particular ring memories and/or controllers with the following properties:

-   where not all records are used, and which have the capability of providing a position where the read and/or write position pointer of the ring memory is set to the beginning or the end of the ring memory. This can be implemented, for example, by using command strings (STOP, GOTO, etc.), counters, or registers storing the start and stop positions;
-   which make it possible to divide the ring memory into independent sections, and the controller of the ring memory can be set, for example, via the events listed below as examples, so that it works on one of these sections;
-   which make it possible to divide the ring memory into independent sections and there is a plurality of controllers, each one working on one section; a plurality of controllers may work on the same section. This can be implemented via arbiter switching, in which case certain processing cycles are lost. Registers can also be used instead of RAMs;
-   each controller has one or more read position pointers and/or one or more write position pointers;
-   this position pointer can be moved forward and/or backward;
-   this position pointer can be set to the start, end, or a given position on the basis of one or more events;
-   the controller has a mask register with which a subset can be selected from the set of all possible events by entering a data string. Only this subset of results is relayed to the controller as an event and triggers the forwarding of the position pointer(s);
-   controllers working with a multiple of the actual system clock rate (oversampling) to allow the processing of several records within a system cycle.

The switching table controller is implemented using a regular state machine. In addition to simple controllers required by a conventional ring memory, controllers with the following properties are best suited for performing or possibly expanding the control of the switching tables of a programmable unit (in particular also of FPGAs and DPGAs (Dynamically Programmable Gate Arrays, a new subgroup of FPGAs)) according to the present invention:

-   controllers capable of recognizing specific command strings. A command string is distinguished by the fact that it has an identifier, which allows the controller to recognize the data of a ring memory record as a command string rather than a data string;
-   controllers capable of executing specific command strings; specifically commands that change the sequence of the state machine and/or modify records of the ring memory through a data processing function;
-   controllers capable of recognizing an identifier and of processing additional records of the ring memory through the internal, higher-speed cycle (oversampling) on the basis of this identifier, until an end identifier is reached, or the next cycle of the clock that controls the oversampling cycle is reached.

In particular the following commands or a subset of those commands can be used as command strings for the appropriate control of a switching table requiring command string control. The command strings concerning position pointers can be used on the read position pointer(s) or on the write position pointer(s). Possible command strings include:

-   A WAIT command.
    The WAIT command causes the controller to wait until the next event or (possibly several) events occur. During this state, the read/write position pointer(s) is (are) not moved. If the event(s) occur(s), the read/write position pointer(s) is (are) positioned on the next record.
-   A SKIP command.
    The SKIP command causes a given number of ring memory records to be skipped by one of the following two methods:

The SKIP1 command is executed fully in a single processing cycle. If, for example, SKIP 5 is issued, the pointer jumps to the record located five records before (after) the current read/write record in a processing cycle.

The SKIP2 command is only executed after a number of processing cycles. It is conceivable, for example, that the SKIP 5 command is executed only after five processing cycles. Here again five records are skipped counting from the current record. The parameter (in this case the 5) is thus used twice.

The direction of the jump is indicated by the sign of the number: a positive number results in a forward movement of the position pointer, a negative number in a backward movement.

-   A SWAP command.
    The SWAP command swaps the data of two given records.
-   A RESET command.
    The RESET command sets the read/write position pointer(s) to the start and/or a given record position within the ring memory.
-   A WAIT-GOTO command.
    The WAIT-GOTO command waits like the above-described WAIT command for one or more specific events and then positions the read/write position pointer to a specific start state within one or more processing cycles.
-   A NOP command.
    The NOP command executes no action. No data is transmitted from the ring memory to the element(s) to be configured, neither are the position pointers modified. Thus the NOP command identifies a record as non-relevant. However, this record is addressed and evaluated by the ring memory controller, and it therefore requires one or more processing cycles.
-   A GOTO command.
    The GOTO command positions the read/write position pointer(s) on the given record position.
-   A MASK command.
    The MASK command writes a new data string into the multiplexer, which selects the different events. Therefore, this command allows the events to which the controller responds to be changed.
-   An LLBACK command.
    The LLBACK command generates a feedback to the PLU (as described in German Patent Application No. DE 44 16 881 A1). The switching table can cause greater regions of the unit to be reloaded, in particular it can cause the switching table itself to be reloaded.
-   A command triggering a read/modify/write cycle.
    The command triggers the reading of commands or data in another record, for example, by the controller, the PLU or an element located outside the switching table. This data is then processed in any desired fashion and written into the same or another position of the switching table ring memory. This can take place during one processing cycle of the switching table. The sequence is then terminated before a position pointer is repositioned.
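The effect of a few of these commands on a read position pointer can be sketched as follows. This is only an illustrative model: the tuple-based record layout, the function name `run_cycle`, and the choice to advance the pointer by one record after SWAP and NOP are assumptions made for the example, not details taken from the description.

```python
def run_cycle(ring, pos, event=False):
    """Process the record at `pos` for one processing cycle.

    Returns (new_pos, config), where config is the configuration string
    sent to the configurable element, or None if nothing is sent.
    """
    kind, *payload = ring[pos]
    size = len(ring)
    if kind == "DATA":                      # plain configuration record
        return (pos + 1) % size, payload[0]
    if kind == "NOP":                       # evaluated, but no action
        return (pos + 1) % size, None
    if kind == "WAIT":                      # hold the pointer until an event
        return ((pos + 1) % size if event else pos), None
    if kind == "SKIP1":                     # signed jump within one cycle
        return (pos + payload[0]) % size, None
    if kind == "GOTO":                      # absolute jump
        return payload[0] % size, None
    if kind == "SWAP":                      # exchange two records
        a, b = payload
        ring[a], ring[b] = ring[b], ring[a]
        return (pos + 1) % size, None
    raise ValueError(f"unknown record type: {kind}")


ring = [
    ("DATA", "cfg0"),     # record 0
    ("WAIT",),            # record 1
    ("SKIP1", 2),         # record 2: jump to record 4
    ("DATA", "unused"),   # record 3: skipped
    ("DATA", "cfg4"),     # record 4
    ("GOTO", 0),          # record 5: back to the start
]
pos, applied = 0, []
for event in (False, False, True, False, False, False, False):
    pos, cfg = run_cycle(ring, pos, event)
    if cfg is not None:
        applied.append(cfg)
```

In this run the pointer stays on the WAIT record until the third cycle delivers an event, and record 3 is never transferred because SKIP1 jumps over it.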

The ring memory record architecture has the following format:

Data/Command Run/Stop Data

The first bit identifies a record as a command or a data string. The controller of the switching table thus decides whether the bit string in the data portion of the record should be treated as a command or as configuration data.

The second bit identifies whether the controller should proceed immediately with the next record, even without the occurrence of another event, or wait for the next event. If an oversampling process is used and the RUN bit is set, the subsequent records will be processed with the help of this oversampling cycle. This continues until a record without a RUN bit set has been reached or the number of records that can be processed at the oversampling cycle rate within one system cycle has been reached.

If an oversampling process is used, the normal system cycle and the RUN bit set cause commutation to take place. Events occurring during the execution of a command sequence marked with the RUN bit are analyzed and the trigger signal is stored in a flip-flop. The controller then analyzes this flip-flop again when a record without a RUN bit set is reached.
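The interplay of the RUN bit and oversampling can be sketched as a small loop. The function name `process_system_cycle`, the `(run_bit, payload)` record layout, and the parameter `max_per_cycle` are hypothetical. The sketch also assumes that the first record without a RUN bit is itself still processed before the controller stops, which the text leaves open.

```python
def process_system_cycle(records, pos, max_per_cycle):
    """Process a burst of records within one system cycle.

    records       -- list of (run_bit, payload) tuples forming the ring
    pos           -- current read position
    max_per_cycle -- how many records the oversampling clock can handle
                     within one system cycle (assumed parameter)
    Returns (new_pos, processed_payloads).
    """
    processed = []
    for _ in range(max_per_cycle):
        run_bit, payload = records[pos]
        processed.append(payload)
        pos = (pos + 1) % len(records)
        if not run_bit:
            break          # a record without RUN bit ends the burst
    return pos, processed


records = [(True, "a"), (True, "b"), (False, "c"), (True, "d")]
pos, burst = process_system_cycle(records, 0, max_per_cycle=4)
short_pos, short_burst = process_system_cycle(records, 0, max_per_cycle=2)
```

With four oversampling slots the burst ends at the record without a RUN bit; with only two slots it ends because the system cycle is exhausted.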

The rest of a record contains, depending on the type (data or command), all the necessary information, so that the controller can fully perform its function.

The size of the ring memory can be implemented according to the application; this is true in particular for programmable units, where the ring memory is obtained by configuring one or more configurable cells.

A ring memory is connected to an element to be configured (or a group of elements to be configured), so that a selected configuration string (in the ring memory) is entered in the configuration register of the element to be configured or group of elements to be configured.

Thus a valid and operational configuration of the element or group to be configured is obtained.

Each ring memory has one controller or a plurality of controllers, which control the positioning of the read position pointer and/or the write position pointer.

Using the feedback channels described in German Patent Application DE 44 16 881 A1, the controller can respond to events of other elements of the unit or to external events that are transmitted into the unit (e.g., interrupt, IO protocols, etc.) and, in response to these internal or external events, moves the read position pointer and/or the write position pointer to another record.

The following events are conceivable, for example:

-   clock signal of a CPU,
-   internal or external interrupt signal,
-   trigger signal of other elements within the unit,
-   comparison of a data stream and/or a command stream with a value,
-   input/output events,
-   counter run, overrun, reset,
-   evaluation of a comparison.

If a unit has several ring memories, the controller of each ring memory can respond to different events.

After each time the pointer is moved to a new record, the configuration string in this record is transferred to the configurable element(s) connected to the ring memory.
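A minimal sketch of this event-driven behavior is given below, assuming two independent switching tables that respond to different events. The class `SwitchingTable`, the event names, and the configuration strings are invented for illustration, as is the choice of the first record as the start configuration.

```python
class SwitchingTable:
    """Hypothetical model: a ring of configuration strings plus a read pointer."""

    def __init__(self, records, events):
        self.records = list(records)
        self.events = set(events)      # events this controller responds to
        self.read_pos = 0
        # Start status: the first record is taken as the initial configuration.
        self.config_register = self.records[0]

    def on_event(self, event):
        """Advance the read pointer on a relevant event and transfer the record."""
        if event in self.events:
            self.read_pos = (self.read_pos + 1) % len(self.records)
            self.config_register = self.records[self.read_pos]
        return self.config_register


alu_table = SwitchingTable(["ADD", "MUL", "SHIFT"], events={"irq"})
bus_table = SwitchingTable(["route_a", "route_b"], events={"counter_overrun"})

for event in ("irq", "counter_overrun", "irq", "irq"):
    alu_table.on_event(event)
    bus_table.on_event(event)
```

Each table only touches its own configuration register, so the element attached to `bus_table` keeps its configuration while `alu_table` cycles through its records.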

This transfer takes place so that the operation of the unit parts that are not affected by the reconfiguration remains unchanged.

The ring memory(ies) may be located either in a unit or connected to the unit from the outside via an external interface.

Each unit may have a plurality of independent ring memories, which can be concentrated in a region of the unit, but can also be distributed in a reasonable manner on the surface of the unit.

The configuration data is loaded by a PLU, such as described in German Patent Application No. DE 44 16 881 A1, or by other internal cells of the unit into the memory of the switching table. The configuration data can also be simultaneously transferred by the PLU or other internal cells of the unit to several different switching tables in order to allow the switching tables to load simultaneously.

The configuration data can also be in the main memory of a data processing system and be transferred by known methods, such as DMA or other processor-controlled data transfer, instead of the PLU.

After the PLU has loaded the ring memory of the switching table, the controller of the switching table is set to a start status, which establishes a valid configuration of the complete unit or parts of the unit. The control of the switching table starts now with repositioning of the read position pointer and/or the write position pointer as a response to events taking place.

In order to cause new data to be loaded into the switching table or a number of switching tables, the controller can return a signal to the PLU, as described in German Patent Application No. DE 44 16 881 A1, or other parts of the unit that are responsible for loading new data into the ring memory of the switching table. Such a feedback can be triggered by the analysis of a special command, a counter status, or from the outside (the State-Back UNIT described in Patent Application PACT02).

The PLU or other internal cells of the unit analyze this signal, respond to the signal by executing a program possibly in a modified form, and transfer new or different configuration data to the ring memory(ies). Only the data of each ring memory that is involved in a data transfer as determined by the signal analysis, rather than the configuration data of a complete unit, must be transferred.

Buffer: A memory can be connected to individual configurable elements or groups thereof (hereinafter referred to as functional elements). Several known procedures can be used to configure this memory; FIFOs are well-known, in particular. The data generated by the functional elements are stored in the memory until a data packet with the same operation to be performed is processed or until the memory is full. Thereafter the configuration elements are reconfigured through switching tables, i.e., the functions of the elements are changed. A FullFlag showing that the memory is full can be used as a trigger signal for the switching tables. In order to freely determine the amount of data, the position of the FullFlag is configurable, i.e., the memory can also be configured through the switching table. The data in the memory is sent to the input of the configuration elements, and a new operation is performed on the data; the data is the operand for the new computation. The data can be processed from the memory only, or additional data can be requested from the outside (outside the unit or other functional elements) for this purpose. As the data is processed, it (the result of the operation) can be forwarded to the next configuration elements or written into the memory again. In order to provide both read and write access to the memory, the memory can have two memory arrays, which are processed alternately, or separate read and write position pointers can exist in the same memory.
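The variant with two alternately processed memory arrays might be modeled as below. The class `PingPongBuffer`, its `full_at` parameter (standing in for the configurable FullFlag position), and the two example operations are assumptions chosen only to make the sequence concrete.

```python
class PingPongBuffer:
    """Hypothetical buffer with two memory arrays used alternately."""

    def __init__(self, full_at):
        self.arrays = [[], []]
        self.write_idx = 0        # index of the array currently written
        self.full_at = full_at    # configurable FullFlag position

    @property
    def full(self):
        """FullFlag: the write array has reached the configured fill level."""
        return len(self.arrays[self.write_idx]) >= self.full_at

    def write(self, value):
        self.arrays[self.write_idx].append(value)

    def swap(self):
        """Exchange the roles of the arrays; return the one that is now read."""
        readable = self.arrays[self.write_idx]
        self.write_idx ^= 1
        self.arrays[self.write_idx] = []   # start with an empty write array
        return readable


buf = PingPongBuffer(full_at=3)
square = lambda x: x * x        # first configuration of the functional element
increment = lambda x: x + 1     # second configuration after reconfiguration

for x in (1, 2, 3):
    buf.write(square(x))
full_flag = buf.full            # would act as trigger for the switching table
stage1 = buf.swap()             # former write array becomes the read array
for y in stage1:
    buf.write(increment(y))     # results go into the other array
stage2 = buf.swap()
```

The results of the first operation serve as operands of the second one without leaving the buffer, which is the point of introducing it.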

One particular configuration option is the connection of a plurality of memories as described above, which allows several results to be stored in separate memories; then, at a given time, several memory regions are sent to the input of a functional element and processed in order to execute a given function.

Architecture of a ring memory record: One possible structure of the records in a switching table ring memory, used in a data processing system as described in German Patent Application No. DE 44 16 881 A1 is described below. The following tables show the command architecture using the individual bits of a command string.

Bit Number   Name           Description
0            Data/Command   Identifies a record as a data or command string
1            Run/Stop       Identifies Run or Stop mode

Thus, if a record is a data record, bit number 0 has the value 0, so the bits from position two have the following meanings:

Bit Number   Name                 Description
2-6          Cell number          Provides the cell numbers within a group using the same switching table
7-11         Configuration data   Provides the function that the cell (e.g., an EALU) should execute

If the record is a command, bit number 0 has the value 1, and the bits from position two have the following meanings:

Bit Number   Name                          Description
2-6          Command number                Provides the number of the command that is executed by the switching table controller
7            Read/Write position pointer   Shows whether the command is to be applied to the read position pointer or the write position pointer. If the command does not change either position pointer, the bit status is undefined.
8-n          Data                          Depending on the command, the data needed for the command are stored starting with bit 8.

In the following table, bits 2-6 and 8-n are shown for each of the commands listed. The overall bit length of a data string depends on the unit where the switching table is used. The bit length must be chosen so as to code all data needed for the commands in the bits starting from position 8.

Command     Bits 2-6   Description of bits 8-n
WAIT        00000      Number indicating how often an event is to be waited for
SKIP1       00001      Number with plus or minus sign showing how many records are to be skipped forward (backward if negative)
SKIP2       00010      See SKIP1
SWAP        00011      1st record position, 2nd record position
RESET       00100      Number of the record on which the position pointer is to be set
WAIT-GOTO   00101      Number indicating how often an event is to be waited for, followed by the number of the record on which the position pointer is to be positioned
NOP         00110      No function
GOTO        00111      Number of the record on which the position pointer is to be positioned
MASK        01000      Bit pattern entered into the multiplexer to select the events
LLBACK      01001      A trigger signal is generated for the PLU (feedback)
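The record layout given in the tables above can be illustrated with a small encoder and decoder. Several points are assumptions of this sketch rather than statements of the description: bit 0 is treated as the least significant bit of an integer word, the five command bits are read as an ordinary binary number, a set bit 7 is taken to select the write position pointer, and the data field is restricted to non-negative values. The function names are hypothetical.

```python
# Command numbers from the table (bits 2-6), read as 5-bit binary numbers.
OPCODES = {
    "WAIT": 0b00000, "SKIP1": 0b00001, "SKIP2": 0b00010, "SWAP": 0b00011,
    "RESET": 0b00100, "WAIT-GOTO": 0b00101, "NOP": 0b00110, "GOTO": 0b00111,
    "MASK": 0b01000, "LLBACK": 0b01001,
}
NAMES = {code: name for name, code in OPCODES.items()}


def encode_command(name, run=False, write_pointer=False, data=0):
    """Pack a command record into an integer (bit 0 = least significant bit)."""
    word = 1                            # bit 0 = 1: command record
    word |= int(run) << 1               # bit 1: Run/Stop
    word |= OPCODES[name] << 2          # bits 2-6: command number
    word |= int(write_pointer) << 7     # bit 7: pointer select (assumed polarity)
    word |= data << 8                   # bits 8-n: command data
    return word


def decode(word):
    """Unpack a record into a dictionary according to the tables."""
    run = bool((word >> 1) & 1)
    if not word & 1:                    # bit 0 = 0: data record
        return {"type": "data", "run": run,
                "cell": (word >> 2) & 0x1F,      # bits 2-6
                "config": (word >> 7) & 0x1F}    # bits 7-11
    return {"type": "command", "run": run,
            "name": NAMES[(word >> 2) & 0x1F],
            "write_pointer": bool((word >> 7) & 1),
            "data": word >> 8}


goto_word = encode_command("GOTO", run=True, data=12)
goto_fields = decode(goto_word)
data_fields = decode((3 << 2) | (5 << 7))   # data record: cell 3, function 5
```

A signed SKIP parameter would need an agreed width and sign convention for bits 8-n, which depends on the unit and is therefore left out here.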

Reconfiguring ALUs: One or more switching tables can be used for controlling an ALU. The present invention can be used, for example, to improve on Patent PACT02, where the switching table is connected to the M/F PLUREG registers or the M/F PLUREG registers are fully replaced by a switching table.

FIG. 1 shows the schematic architecture of a ring memory. It comprises a write position pointer 0101 and a read position pointer 0102, which access a memory 0103. This memory can be configured as a RAM or as a register. Using the read/write position pointer, an address of RAM 0104 is selected, where input data is written or data is read, depending on the type of access.

FIG. 2 shows the internal architecture of a simple ring memory. Read position pointer 0204 has a counter 0201 and write position pointer 0205 has a counter 0206. Each counter 0201, 0206 has a global reset input and an up/down input, through which the counting direction is defined. A multiplexer 0202, whose inputs are connected to the outputs of the counters, is used to switch between write 0205 and read 0204 position pointers, which point to an address of memory 0203. Read and write access is performed through signal 0207. The respective counter is incremented by one position for each read or write access. When the read 0204 or write 0205 position pointer points at the last position of the memory (last address for an upward counting counter or first address for a downward counting counter), the read or write position pointer 0204, 0205 is set to the first position of memory 0203 in the next access (first address for an upward counting counter or the last address for a downward counting counter). This provides the ring memory function.
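The wrap-around behavior of the two counters can be sketched in software as follows. The class names `RingCounter` and `RingMemory` are hypothetical, and the reset value (first address when counting up, last address when counting down) is an assumption of the sketch.

```python
class RingCounter:
    """Address counter with up/down direction that wraps around."""

    def __init__(self, size, up=True):
        self.size = size
        self.up = up
        self.reset()

    def reset(self):
        # Assumed reset value: first address counting up, last counting down.
        self.value = 0 if self.up else self.size - 1

    def step(self):
        """Advance by one position per access, wrapping at the memory end."""
        delta = 1 if self.up else -1
        self.value = (self.value + delta) % self.size


class RingMemory:
    """Memory addressed by separate read and write position pointers."""

    def __init__(self, size):
        self.cells = [None] * size
        self.read_ptr = RingCounter(size)
        self.write_ptr = RingCounter(size)

    def write(self, record):
        self.cells[self.write_ptr.value] = record
        self.write_ptr.step()

    def read(self):
        record = self.cells[self.read_ptr.value]
        self.read_ptr.step()
        return record


ring = RingMemory(3)
for cfg in ("A", "B", "C", "D"):    # the fourth write wraps to address 0
    ring.write(cfg)
first_three = [ring.read() for _ in range(3)]

down = RingCounter(4, up=False)     # starts at the last address
for _ in range(4):
    down.step()                     # 3 -> 2 -> 1 -> 0 -> 3
```

Because the fourth write lands on address 0 again, the record "A" is overwritten by "D", which is the ring property described for FIG. 2.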

FIG. 3 shows an extension of the normal ring memory. In this extension, counter 0303 of the write position pointer 0311 and counter 0309 of the read position pointer 0312 can be loaded with a value, so that each address of the memory can be set directly. This loading sequence takes place, as usual, through the data and load inputs of the counters. In addition, the work area of the ring memory can be limited to a certain section of internal memory 0306. This is accomplished using an internal logic controlled by counters 0303, 0309 of the write/read position pointers 0311, 0312. This logic is designed as follows: The output of one counter 0303, 0309 is connected to the input of the respective comparator 0302, 0308, where the value of the counter is compared with the value of the respective data register (0301, 0307) where the jump position, i.e., the end of the ring memory section, is stored. If the two values are the same, the comparator (0302, 0308) sends a signal to the counter (0303, 0309), which then loads the value from the data register for the target address of the jump (0304, 0310), i.e., the beginning of the ring memory section. The data register for the jump position (0301, 0307) and the data register for the target address (0304, 0310) are loaded by the PLU (see PACT01). With this extension, it is possible that the ring memory does not use the entire region of the internal memory, but only a selected portion. In addition, the memory can be subdivided into different sections when several such read/write position pointers (0312, 0311) are used.
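The comparator logic that confines a position pointer to a section can be modeled as below. The class name `SectionCounter` and its attribute names are illustrative, and the sketch assumes an upward counting counter.

```python
class SectionCounter:
    """Loadable counter restricted to a section of the internal memory."""

    def __init__(self, jump_position, target_address):
        self.jump_position = jump_position    # data register 0301 / 0307
        self.target_address = target_address  # data register 0304 / 0310
        self.value = target_address

    def load(self, address):
        """Set the counter directly via its data and load inputs."""
        self.value = address

    def step(self):
        # Comparator 0302 / 0308: at the end of the section, reload the
        # target address instead of counting further.
        if self.value == self.jump_position:
            self.value = self.target_address
        else:
            self.value += 1


ptr = SectionCounter(jump_position=6, target_address=4)
visited = []
for _ in range(7):
    visited.append(ptr.value)
    ptr.step()
```

Several such counters with disjoint register values would each circulate within their own section of the same memory.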

FIG. 4 shows the architecture of a ring memory divided into severalsections with controller 0401 working on one of said sections. Thecontroller is described in more detail in FIG. 7. In order to allow thering memory to be divided into several sections, several read/writeposition pointers (0408, 0402), whose architecture was shown in FIG. 3,are used. The controller selects the region where it operates throughmultiplexer 0407. Read or write access is selected via multiplexer 0403.Thus the selected read/write position pointer addresses an address ofmemory 0404.

FIG. 5 shows the case where each of a plurality of controllers 0501operates in its own region of the ring memory via one read- andwrite-position pointer 0502, 0506 per controller. Each controller 0501has a write position pointer 0506 and a read position pointer 0502.Using multiplexer 0503, which of the read and write position pointers0502, 0506 accesses memory 0504 is selected. Either a read access or awrite access is selected via multiplexer 0503. The read/write signal ofcontrollers 0501 is sent to memory 0504 via multiplexer 0507. Thecontrol signal of multiplexers 0507, 0505, 0503 goes from controllers0501 via an arbiter 0508 to the multiplexers. Arbiter 0508 preventsseveral controllers from accessing multiplexers 0507, 0505, 0503simultaneously.

FIG. 6 shows a ring memory 0601 and its connection with configurationelements 0602. Ring memory 0601 is connected via lines 0604, 0605, 0606.The addresses of the addressed cells 0607 are transmitted via 0604. Line0605 transmits the configuration data from the ring memory. Via line0606, cells 0607 transmit the feedback whether reconfiguration ispossible. The data stored in the ring memory is entered in configurationelement 0602. This configuration element 0602 determines theconfiguration of configurable elements 0603. Configurable elements 0603may comprise logical units, ALUs, for example.

FIG. 7 shows a controller that may respond to different triggeringevents. The individual triggering events can be masked, so that only onetriggering event is accepted at any time. This is achieved usingmultiplexer 0701. The trigger signal is stored with flip-flop 0704.Multiplexer 0702, which can be configured as a mask via AND gates (seeFIG. 7 a), is used to process low active and high active triggeringsignals. The triggering signal stored in the flip-flop is relayed vialine 0705 to obtain a clock signal, which is described in FIG. 8. Thestate machine 0703 receives its clock signal from the logic thatgenerates the clock signal and, depending on its input signals, deliversan output signal and a reset signal to reset flip-flop 0704 and stopprocessing until the next trigger signal. The advantage of thisimplementation is the power savings when the clock is turned off, sincestate machine 0703 is then idle. An implementation where the clock ispermanently applied and the state machine is controlled by the status ofthe command decoder and the run bit is also conceivable.

FIG. 7 a shows the masking of the trigger signals. The trigger signals and lines from A are connected to the inputs of AND gate 0706. The outputs of AND gate 0706 are OR-linked with 0707 to generate the output signal.
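The masking of FIG. 7 a may be illustrated by the following minimal sketch. It assumes that a mask line at logic 1 lets the corresponding trigger input pass and that a mask line at logic 0 blocks it; the polarity and the function name masked_trigger are illustrative assumptions and are not taken from the figures.

```python
def masked_trigger(triggers, mask):
    """Return the OR of the pairwise AND of trigger inputs and mask lines.

    Assumption: mask value 1 passes a trigger, mask value 0 blocks it.
    """
    if len(triggers) != len(mask):
        raise ValueError("one mask line is required per trigger input")
    # One AND (0706) per trigger input, then one OR stage (0707) over all outputs.
    return any(bool(t) and bool(m) for t, m in zip(triggers, mask))
```

For example, masked_trigger([1, 0], [0, 1]) yields False, because the only active trigger input is masked out.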

FIG. 8 shows the logic for generating the clock signal for the state machine. Another clock signal is generated in 0801 with the help of a PLL. Using multiplexer 0802, the normal chip clock or the clock of PLL 0801 can be selected. Signals C and B are sent to OR gate 0804. Signal C is generated as a result of a trigger event in the controller (see FIG. 7, 0705). Signal B originates from bit 1 of the command string (see FIG. 10, 1012). This bit has the function of a run flag, so that the controller continues to operate, independently of a trigger pulse, if the run flag is set. The output of OR gate 0804 is AND-linked with the output of multiplexer 0802 to generate the clock signal for the state machine.
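The clock gating of FIG. 8 may be summarized, as a sketch only, by the Boolean expression below. The function name and the parameter select_pll (standing for the control input of multiplexer 0802) are hypothetical; clock levels are modeled as plain Boolean samples rather than as waveforms.

```python
def state_machine_clock(chip_clk, pll_clk, select_pll, signal_c, signal_b):
    """Model one sample of the gated state machine clock of FIG. 8."""
    # Multiplexer 0802: choose between the normal chip clock and the PLL clock.
    selected_clock = pll_clk if select_pll else chip_clk
    # OR gate 0804: a stored trigger (C) or the run flag (B) enables the clock.
    enable = signal_c or signal_b
    # AND stage: the selected clock reaches the state machine only when enabled.
    return bool(enable and selected_clock)
```

With both signal_c and signal_b at 0 the result is always False, which corresponds to the idle, power-saving case described for FIG. 7.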

FIG. 9 shows the connection between controller 0907, PLU 0902 with memory 0901, ring memory 0906, configurable elements 0905, and configuration elements 0908, as well as the internal cells 0903 used for the configuration. The internal cell 0903 used for configuration is shown here as a normal cell with configurable elements 0905 and configuration elements 0908. Ring memory 0906 is connected to configuration elements 0908 and is in turn controlled by controller 0907. Controller 0907 responds to different trigger pulses, which may also originate from the internal cell 0903 used for configuration. Controller 0907 informs PLU 0902, via feedback channel 0909, if new data is to be loaded into ring memory 0906 due to a trigger event. In addition to sending this feedback, controller 0907 also sends a signal to multiplexer 0904 and selects whether data is sent from PLU 0902 or internal cell 0903 used for configuration to the ring memory.

In addition to the configuration of the ring memory by the PLU, the ring memory can also be set as follows: Configurable element 0903 is wired so that it generates, alone or as the last element of a group of elements, records for ring memory 0906. It generates a trigger pulse, which advances the write position pointer in the ring memory. In this mode, multiplexer 0904 switches the data from 0903 through to the ring memory, while with a configuration by the PLU the data are switched through by the PLU. It would, of course, be conceivable that additional permanently implemented functional units might serve as sources of the configuration signals.

FIG. 10 shows the processing by the controller of the commands stored in the ring memories. 1001 represents the memory of the ring memory with the following bit assignment. Bit 0 identifies the record as a data or command string. Bit 1 identifies the run and stop modes. Bits 2-6 identify the command number coding the commands. Bit 7 tells whether the command is to be applied to the read or write position pointer. If the command affects no position pointer, bit 7 is undefined. The data needed for a command is stored in bits 8-n. Counters 1004, 1005 form the write and read position pointers of the ring memory. If the controller receives a trigger pulse, the state machine sends a pulse to the read position pointer. The write position pointer is not needed to read a command, but is only used for entering data in the ring memory. The selected read position pointer moves forward one position, and a new command is selected (bit 0=0). Now bits 2-6 and bit 7 are sent to command decoder 1002, are decoded, and the result is relayed to the state machine (1024), which recognizes the type of command and switches accordingly.
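The bit assignment of record 1001 may be illustrated by the following decoding sketch. It assumes that bit 0 is the least significant bit of an integer word; the names Record and decode_record and the field names are hypothetical. Which value of bit 7 selects the read or the write position pointer is not stated, so the raw bit is returned unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    is_data: bool        # bit 0: 1 = data string, 0 = command string
    run: bool            # bit 1: run flag (run/stop mode)
    number: int          # bits 2-6: command number (cell address for a data string)
    pointer_select: int  # bit 7: read or write position pointer (polarity not specified)
    payload: int         # bits 8-n: data needed for the command


def decode_record(word: int) -> Record:
    """Split a ring memory word into its fields, assuming bit 0 is the LSB."""
    return Record(
        is_data=bool(word & 1),
        run=bool((word >> 1) & 1),
        number=(word >> 2) & 0x1F,
        pointer_select=(word >> 7) & 1,
        payload=word >> 8,
    )
```

The five-bit mask 0x1F follows from bits 2-6 spanning five positions.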

If it is a SKIP command, state machine 1011 sends a pulse to adder/subtractor 1006 so it can add/subtract the bit 8-n command string data to/from the data sent by counters 1004, 1005 via multiplexer 1003. Depending on bit 7, multiplexer 1003 selects the counter of write position pointer 1004 or the counter of read position pointer 1005. After the data has been added/subtracted, state machine 1011 activates gate 1010 and sends a receive signal to counter 1004, 1005. Thus the selected position pointer points as many positions forward or backward as set forth in the data of the SKIP command.

Upon a GOTO command, gate 1007 is activated by state machine 1011 so that the data goes to read position pointer 1005 or write position pointer 1004 and is received there.
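The effect of the SKIP and GOTO commands on the position pointers may be sketched as follows. The class PositionPointers, the string keys "read" and "write" (standing for the selection made by bit 7), and the wrap-around by modulo arithmetic are illustrative assumptions; a negative offset stands for the subtract case of adder/subtractor 1006.

```python
class PositionPointers:
    """Read and write position pointers of a ring memory of a given size."""

    def __init__(self, size):
        self.size = size
        self.pointers = {"read": 0, "write": 0}

    def skip(self, which, offset):
        # SKIP: add (or, for a negative offset, subtract) the command data
        # to/from the selected pointer; wrap-around is assumed here.
        self.pointers[which] = (self.pointers[which] + offset) % self.size

    def goto(self, which, target):
        # GOTO: load the command data directly into the selected pointer.
        if not 0 <= target < self.size:
            raise ValueError("jump target outside the ring memory")
        self.pointers[which] = target
```

For a ring memory of eight records, a GOTO to position 6 followed by a SKIP of 3 leaves the selected pointer at position 1 under the assumed wrap-around.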

Upon a MASK command, the data is received in a latch 1008 and stored there. This data is then available to the controller described in FIGS. 7/7 a via line A (1013) where it masks all the trigger inputs which should receive no trigger pulse.

Upon a WAIT command, an event is waited for as often as set forth in the data bits. If this command is registered by state machine 1011, it sends a pulse to wait cycle counter 1009 which receives the data. The wait cycle counter then counts one position downward for each event relayed by state machine 1011. As soon as it has counted to zero, the carry flag is set and sent to state machine 1011 (1023). The state machine then continues to operate due to the carry flag.

Upon a WAIT-GOTO command, the data providing the number of wait events is received in the wait cycle counters. After receipt of the number of events given in the data, the state machine activates gate 1007 and relays the jump position data to the selected counter.
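The behavior of wait cycle counter 1009 for the WAIT and WAIT-GOTO commands may be modeled, as a sketch under stated assumptions, by a down-counter that reports its carry flag after each event. The class and method names are hypothetical.

```python
class WaitCycleCounter:
    """Down-counter that signals a carry flag once it has counted to zero."""

    def __init__(self):
        self.count = 0

    def load(self, events):
        # The counter receives the number of events from the command data.
        self.count = events

    def event(self):
        # Count one position downward per relayed event.
        if self.count > 0:
            self.count -= 1
        # The return value models the carry flag sent to the state machine.
        return self.count == 0
```

After load(3), the first two calls of event() return False and the third returns True; for a WAIT-GOTO command the state machine would then relay the jump position to the selected counter.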

The SWAP command is used for swapping two records between two positions of the ring memory. The address of the first record to be swapped is stored in latch 1017; the address of the second record is stored in latch 1018. The addresses are sent to multiplexers 1015 and 1016 of the read/write pointer. Initially, record 1 is selected via 1016 and stored in latch 1019; then record 2 is selected via 1016 and stored in 1020. The write pointer is first positioned on the first record via 1015, and the data formerly of the second record is stored via gate 1022. Then the write pointer is positioned on the second record via 1015 and the data formerly of the first record is stored via gate 1021.
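The sequence of the SWAP command may be followed step by step in the sketch below, in which the ring memory is modeled as a Python list and the latches as local variables. The function name swap_records is hypothetical.

```python
def swap_records(memory, address_1, address_2):
    """Swap two ring memory records in place, mirroring the latch sequence."""
    # Read phase: both records are buffered (latches 1019 and 1020).
    latch_1019 = memory[address_1]
    latch_1020 = memory[address_2]
    # Write phase: the former second record goes to the first position ...
    memory[address_1] = latch_1020
    # ... and the former first record goes to the second position.
    memory[address_2] = latch_1019
```

Both records are read before either is written, which is why two data latches are needed.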

State machine 1011 sends feedback to the PLU via 1014 (e.g., via a State-Back UNIT, see PACT02). The state machine sends a signal via this connection as soon as an LLBack command is registered.

Bit 1, used as a run flag, is sent to the controller for generating a clock signal, which is described in FIG. 8.

The NOP command is registered in the state machine, but no operation is performed.

FIG. 11 shows the processing of a data string stored in the ring memory. 1101 corresponds to 1001 in FIG. 10. Since this is a data string, bit 0 is set to one. Command decoder 1107 recognizes the data string as such and sends a query 1106 to the cell addressed in bits 2-6 to verify if reconfiguration is possible. The query is sent at the same time gate 1102 is activated, which causes the address of the cell to be transmitted. The cell shows via 1105 whether reconfiguration is possible. If so, the configuration data is transmitted to the cell via gate 1103. If no reconfiguration is possible, processing continues, and reconfiguration is attempted again in the next cycle in the ring memory. Another possible sequence would be the following: The state machine activates gates 1102 and 1103 and transmits the data to the cell addressed. If the cell can be reconfigured, the cell acknowledges receipt of the data via 1105. If no configuration is possible, the cell does not send a receive signal, and reconfiguration is attempted again in the next cycle of the ring memory.
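The query-and-retry handling of data strings may be sketched as follows. The class Cell, its attribute reconfigurable (standing for the feedback on line 1105), and the functions try_reconfigure and run_ring_cycle are hypothetical; a record is modeled as a pair of cell address and configuration data, and one call of run_ring_cycle stands for one pass through the ring memory.

```python
class Cell:
    """Configurable cell that reports whether it can currently be reconfigured."""

    def __init__(self):
        self.reconfigurable = True
        self.configuration = None


def try_reconfigure(cells, address, configuration):
    """Query the addressed cell and transmit the data only if it agrees."""
    cell = cells[address]
    if not cell.reconfigurable:
        return False  # no reconfiguration possible in this cycle
    cell.configuration = configuration
    return True


def run_ring_cycle(cells, records):
    """Process all data records once; return those to retry in the next cycle."""
    return [record for record in records if not try_reconfigure(cells, *record)]
```

A record that is returned by run_ring_cycle is simply presented again on the next pass, which corresponds to the statement that reconfiguration is attempted again in the next cycle of the ring memory.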

FIG. 12 shows a group (functional element) 1202 of configurable elements 1201. The data is sent to the functional element via input bus 1204, and the results are sent forth via output bus 1205. Output bus 1205 is also connected to two memory arrays 1203, which operate alternately as a read or write memory. Their outputs are connected to input bus 1204. The entire circuit can be configured via a bus leading to switching tables 1206; the trigger signals are transmitted to the switching table and the configuration data is transmitted from the switching table via this bus. In addition to the function of the functional element, the write/read memory active at that time and the depth of the respective memory are set.

FIG. 12 a shows how external data 1204, i.e., data of another functional unit or from outside the unit, is computed in the functional element 1202 and then written into write memory 1210.

FIG. 12 b shows the next step after FIG. 12 a. Functional element 1202 and memories 1220, 1221 are reconfigured upon a trigger generated by the functional element or the memories or another unit and transmitted over 1206. Write memory 1210 is now configured as a read memory 1220 and delivers the data for the functional element. The results are stored in write memory 1221.

FIG. 12 c shows the step following FIG. 12 b. Functional element 1202 and memories 1230, 1231 were reconfigured upon a trigger generated by the functional element or the memories or another unit and transmitted over 1206. Write memory 1221 is now configured as a read memory 1230 and delivers the data to the functional element. The results are stored in write memory 1231. In this example, additional external operands 1204, i.e., from another functional unit or from outside the unit, are also processed.

FIG. 12 d shows the next step after FIG. 12 c. Functional element 1202 and memories 1203, 1240 were reconfigured upon a trigger generated by the functional element or the memories or another unit and transmitted over 1206. Write memory 1231 is now configured as a read memory 1240 and delivers the data to the functional element. The results are forwarded via output bus 1205.
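The alternating use of the two memory arrays in FIGS. 12 a to 12 d may be illustrated by the sketch below. The class PingPongMemory and the function run_stages are hypothetical; each entry of stages stands for one configuration of functional element 1202, modeled as a one-argument function, and the additional external operands of FIG. 12 c are omitted for brevity.

```python
class PingPongMemory:
    """Two memory arrays that alternate between write and read roles."""

    def __init__(self):
        self.banks = ([], [])
        self.write_bank = 0

    def write(self, value):
        self.banks[self.write_bank].append(value)

    @property
    def read_data(self):
        return self.banks[self.write_bank ^ 1]

    def swap(self):
        # Reconfiguration: the write memory becomes the read memory and the
        # former read memory is emptied for use as the new write memory.
        self.write_bank ^= 1
        self.banks[self.write_bank].clear()


def run_stages(external, stages):
    """Apply the stages in order; only the last stage drives the output bus."""
    *buffered, last = stages
    memory = PingPongMemory()
    source = list(external)
    for stage in buffered:
        for value in source:
            memory.write(stage(value))  # results go to the write memory
        memory.swap()                   # trigger: exchange the memory roles
        source = list(memory.read_data)
    return [last(value) for value in source]
```

With external data [1, 2, 3] and the three stages "add one", "double", and "subtract one", the forwarded results are [3, 5, 7].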

FIG. 13 shows a circuit according to FIG. 12, where a memory with separate read and write pointers 1301 is used instead of the two memory arrays.

FIG. 14 shows memory 1401 according to FIG. 13. The record in front of read pointer 1402 has already been read or is free 1405. The pointer points to a free record. Data 1406 still to be read are located behind the read position pointer. A free area 1404 and data already re-written 1407 follow. Write position pointer 1403 points at a free record, which is either empty or already has been read. The memory can be configured as a ring memory, as described previously.
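A memory with separate read and write position pointers, configured as a ring memory, may be sketched as follows. The class RingMemory, the counter unread, and the choice to raise an error when no free record or no unread record is available are illustrative assumptions; the figures do not specify the behavior in these cases.

```python
class RingMemory:
    """Memory with separate read and write position pointers that wrap around."""

    def __init__(self, size):
        self.records = [None] * size
        self.read_pointer = 0
        self.write_pointer = 0
        self.unread = 0  # number of records written but not yet read

    def write(self, record):
        if self.unread == len(self.records):
            raise OverflowError("no free record at the write position pointer")
        self.records[self.write_pointer] = record
        # Upon reaching the end, the pointer returns to the beginning.
        self.write_pointer = (self.write_pointer + 1) % len(self.records)
        self.unread += 1

    def read(self):
        if self.unread == 0:
            raise LookupError("no unread record at the read position pointer")
        record = self.records[self.read_pointer]
        self.records[self.read_pointer] = None  # the record is free again
        self.read_pointer = (self.read_pointer + 1) % len(self.records)
        self.unread -= 1
        return record
```

Because records that have been read are free again, writing can continue past the end of the memory as long as reading keeps pace.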

FIG. 15 shows a circuit according to FIG. 12, where both memory banks 1203 are present in duplicate. This makes it possible to store and then simultaneously process a plurality of results.

FIG. 15 a shows how external data 1204, i.e., from another functional unit or from outside the unit, is computed in functional element 1202 and then written in write memory 1510 via bus 1511.

FIG. 15 b shows the next step after FIG. 15 a. Functional element 1202 and memories 1203, 1510, 1520 have been reconfigured following a trigger generated by the functional element or the memories or another unit and transmitted over 1206. External data 1204, i.e., from another functional unit or from outside the unit, is computed in functional element 1202 and then written in write memory 1520 via bus 1521.

FIG. 15 c shows the next step after FIG. 15 b. Functional element 1202 and memories 1203, 1530, 1531, 1532 have been reconfigured following a trigger generated by the functional element or the memories or another unit and transmitted over 1206. Write memories 1510, 1520 are now configured as read memories 1531, 1532 and deliver several operands simultaneously to functional elements 1202. Each read memory 1531, 1532 is connected to 1202 via an independent bus system 1534, 1535. The results are either stored in write memory 1530 via 1533 or forwarded via 1205.
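The three steps of FIGS. 15 a to 15 c may be illustrated by the following sketch, in which two intermediate results are stored one after the other and then delivered as simultaneous operands. The function two_operand_pipeline and its parameters are hypothetical; stage_a, stage_b, and stage_c stand for the three successive configurations of functional element 1202.

```python
def two_operand_pipeline(external_a, external_b, stage_a, stage_b, stage_c):
    """Store two result sets in turn, then combine them operand by operand."""
    # FIG. 15 a: first configuration, results written to write memory 1510.
    memory_1510 = [stage_a(value) for value in external_a]
    # FIG. 15 b: second configuration, results written to write memory 1520.
    memory_1520 = [stage_b(value) for value in external_b]
    # FIG. 15 c: both memories now act as read memories (1531, 1532) and
    # deliver one operand each per step over independent buses.
    read_1531, read_1532 = memory_1510, memory_1520
    return [stage_c(a, b) for a, b in zip(read_1531, read_1532)]
```

For inputs [1, 2] and [10, 20] with the stages "add one", "double", and "sum of both operands", the results are [22, 43].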

Glossary

ALU: Arithmetic Logic Unit. Basic unit for data processing. The unit can perform arithmetic operations such as addition, subtraction, and occasionally also multiplication, division, expansions of series, etc. The unit can be configured as an integer unit or a floating-point unit. The unit can also perform logic operations such as AND, OR, as well as comparisons.

data string: A data string is a series of bits, of any length. This series of bits represents a processing unit for a system. Both commands for processors or similar components and data can be coded in a data string.

DFP: Data flow processor according to German Patent No. DE 44 16 881.

DPGA: Dynamically Configurable FPGAs. Related art.

D flip-flop: Memory element, which stores a signal at the rising edge of a cycle.

EALU: Expanded Arithmetic Logic Unit, ALU which has been expanded to perform special functions needed or convenient for the operation of a data processing device according to German Patent Application No. DE 44 16 881 A1. These are, in particular, counters.

Elements: Generic concept for all enclosed units used as a part in an electronic unit. Thus, the following are defined as elements: configurable cells of all types; clusters; RAM blocks; logics; arithmetic units; registers; multiplexers; I/O pins of a chip.

Event: An event can be analyzed by a hardware element in any manner suitable for the application and trigger an action as a response to this analysis. Thus, for example, the following are defined as events: clock pulse of a CPU; internal or external interrupt signal; trigger signal from other elements within the unit; comparison of a data stream and/or a command stream with a value; input/output events; run, overrun, reset of a counter; analysis of a comparison.

flag: Status bit in a register showing a status.

FPGA: Programmable logic unit. Related art.

gate: Group of transistors that performs a basic logic function. Basic functions include NAND, NOR. Transmission gates.

configurable element: A configurable element represents a component of a logic unit, which can be set for a special function using a configuration string. Configurable elements are therefore all types of RAM cells, multiplexers, arithmetic logic units, registers, and all types of internal and external interconnecting units, etc.

configure: Setting the function and interconnections of a logic unit, an FPGA cell or a PAE (see reconfigure).

configuration data: Any set of configuration strings.

configuration memory: The configuration memory contains one or more configuration strings.

configuration string: A configuration string consists of a series of bits, of any length. This bit series represents a valid setting for the element to be configured, so that an operable unit is obtained.

PLU: Unit for configuring and reconfiguring the PAE. Constituted by a microcontroller designed specifically for this purpose.

latch: Memory element that usually relays a signal transparently during the H level and stores it during the L level. Latches where the level function is reversed are used in some PAEs. Here an inverter is normally connected before the cycle of a normal latch.

read position pointer: Address of the current record for read access within a FIFO or a ring memory.

logic cells: Cells used in DFPs, FPGAs, and DPGAs, performing simple logic and arithmetic functions, depending on their configuration.

oversampling: A clock runs with a frequency that is a multiple of the base clock, synchronously with the same. The faster clock is usually generated by a PLL.

PLL: Phase Locked Loop. Unit for generating a multiple of a clock frequency on the basis of a base clock.

ring memory: Memory with its own read/write position pointer which, upon reaching the end of the memory, is positioned at the beginning of the memory. An endless ring-shaped memory is thus obtained.

RS flip-flop: Reset/Set flip-flop. Memory element that can be switched by two signals.

write position pointer: Address of the current record for write access within a FIFO or ring memory.

State-Back unit: Unit that controls the feedback of status signals to the PLU, comprising a multiplexer and an open-collector bus driver element.

switching table: A switching table is a ring memory, which is addressed by a controller. The records of a switching table may contain any configuration strings. The controller can execute commands. The switching table responds to trigger signals and reconfigures configurable elements using a record in a ring memory.

gate: Switch that forwards or blocks a signal. Simple comparison: relay.

reconfigure: New configuration of any number of PAEs, while any remaining number of PAEs continue their functions (see configure).

processing cycle: A processing cycle describes the time required by a unit to go from a specific and/or valid state into the next specific and/or valid state.

state machine: Logic that can assume different states. The transition between the states depends on different input parameters. These machines are used for controlling complex functions and correspond to the related art.

Conventions

Naming Conventions

unit                       UNIT
mode                       MODE
multiplexer                MUX
negated signal             not-
register visible to PLU    PLUREG
internal register          REG
shift register             sft

Function Conventions

shift register    sft

AND function &

A  B  Q
0  0  0
0  1  0
1  0  0
1  1  1

OR function #

A  B  Q
0  0  0
0  1  1
1  0  1
1  1  1

NOT function !

A  Q
0  1
1  0

GATE function G

EN  D  Q
0   0  -
0   1  -
1   0  0
1   1  1

1. A field programmable gate array (FPGA) device, comprising: configurable elements; a unit for configuring the configurable elements; and at least two memory units; wherein: at least some of the configurable elements are configurable logic elements; at least some of the configurable elements are configurable ALU elements comprising an ALU unit; the at least two memories store data processed by at least some of the configurable ALU elements.
2. The field programmable gate array according to claim 1, wherein each of the at least some of the configurable elements receives its configuration data during operation.

3. The field programmable gate array according to claim 1, wherein each of the at least some of the configurable elements receives its configuration data during operation from other configurable elements.

4. The field programmable gate array according to any one of claims 1, 2, and 3, wherein at least one of the memory units supports a FIFO mode.

5. The field programmable gate array according to any one of claims 1, 2, and 3, wherein at least one of the memory units supports simultaneous write and read access.

6. The field programmable gate array according to any one of claims 1, 2, and 3, wherein at least one of the memory units supports separate write pointers and read pointers.

7. The field programmable gate array according to claim 6, wherein at least one of the memory units supports simultaneous write and read access.

8. The field programmable gate array according to claim 6, wherein at least one of the memory units is configurable as read memory.

9. The field programmable gate array according to claim 6, wherein at least one of the memory units is configurable as write memory.

10. The field programmable gate array according to claim 6, wherein at least one of the memory units is alternatively configured as read or write memory.

11. The field programmable gate array according to any one of claims 1, 2, and 3, wherein at least one of the memory units supports double buffering.

12. The field programmable gate array according to any one of claims 1, 2, and 3, wherein at least one of the memory units is configurable as read memory.

13. The field programmable gate array according to any one of claims 1, 2, and 3, wherein at least one of the memory units is configurable as write memory.

14. The field programmable gate array according to any one of claims 1, 2, and 3, wherein at least one of the memory units is alternatively configured as read or write memory.

15. A runtime programmable data processing device, comprising: configurable elements; a unit for configuring the configurable elements; and at least two memory units; wherein: at least some of the configurable elements are configurable ALU elements comprising an ALU unit; the at least two memory units store data processed by at least some of the configurable ALU elements; and at least some of the configurable elements receive configuration data during operation from other configurable elements.

16. The runtime programmable data processing device according to claim 15, wherein at least one of the configurable elements is configured as a controller providing configuration data during the operation to at least some others of the configurable elements.

17. The runtime programmable data processing device according to claim 15, wherein at least one of the configurable elements is configured as a sequencer providing configuration data during operation to at least some others of the configurable elements.

18. The runtime programmable data processing device according to claim 15, wherein a configurable memory integrated in the runtime programmable data processing device is configured to provide configuration data during the operation to at least some others of the configurable elements.

19. The runtime programmable data processing device according to claim 18, wherein the configurable memory is addressable.

20. The runtime programmable data processing device according to claim 19, wherein the configurable memory is cyclically addressed.

21. The runtime programmable data processing device according to any one of claims 15, 16, 17, 18, 19, and 20, wherein the runtime programmable processing device is a Field Programmable Gate Array (FPGA).

22. The runtime programmable data processing device according to claim 15, wherein at least one of the memory units supports a FIFO mode.

23. The runtime programmable data processing device according to claim 15, wherein at least one of the memory units supports separate write pointers and read pointers.

24. The runtime programmable data processing device according to claim 15, wherein at least one of the memory units supports simultaneous write and read access.

25. The runtime programmable data processing device according to claim 15, wherein at least one of the memory units supports double buffering.

26. The runtime programmable data processing device according to claim 15, wherein at least one of the memory units is configurable as read memory.

27. The runtime programmable data processing device according to claim 15, wherein at least one of the memory units is configurable as write memory.

28. The runtime programmable data processing device according to claim 15, wherein at least one of the memory units is alternatively configured as read or write memory.